Skip to content

Conversation

@yperess
Copy link
Contributor

@yperess yperess commented Oct 7, 2022

Introduce a shim layer for platforms that do not have official support
in cmsis-dsp via a custom "core" that exposed to the cmsis-dsp library
by adding the "cmsis_compiler.h" file to the include path.

@yperess
Copy link
Contributor Author

yperess commented Oct 7, 2022

Note: I didn't implement mul/div yet because I wanted to make sure we're on the same page.

@yperess yperess requested review from teburd and removed request for Mani-Sadhasivam October 7, 2022 06:00
Copy link
Member

@stephanosio stephanosio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem to be what we settled on in the chat:

the idea is basically to define "Zephyr DSP API" [...] that is 1-to-1 mapping to CMSIS-DSP.

why do we have to reinvent the API when there is CMSIS-DSP?

We should not be using architecture specific libraries beyond the architecture domain itself.

it is not architecture specific. as I already explained, the library in itself provides pure C implementations of all functions, so it can run on any architectures as is.

let's take a look at "Basic Math" functions for instance:
https://github.com/ARM-software/CMSIS-DSP/blob/main/Include/dsp/basic_math_functions.h
there really is nothing ARM-specific in the "API" functions.

https://github.com/misaleh/CMSIS-DSP-PULPino
https://github.com/Nuclei-Software/NMSIS/tree/master/NMSIS/DSP
examples of other RISC-V vendors forking CMSIS-DSP for their use
just proves my point that this is the most natural way forward.

and we do have a full testsuite for all the CMSIS-DSP functions so verification is going to be less of a hassle.

So, unless you plan to document all these functions and add full comprehensive test suites for all of them, please keep the API identical to that of the CMSIS-DSP.

Also, remember that the functions you have implemented here is just a subset of the "Basic Math" functions -- there are hundreds more in the CMSIS-DSP and deviating from the CMSIS-DSP is not going to be a sustainable approach unless we have someone working on this full time.

@yperess
Copy link
Contributor Author

yperess commented Oct 7, 2022

Updated to use CMSIS-DSP directly via a shimming "core". Note that this will fail to build on some platforms until zephyrproject-rtos/cmsis#20 is merged.

@yperess yperess marked this pull request as ready for review October 7, 2022 19:41
@zephyrbot zephyrbot requested review from galak and povergoing October 7, 2022 19:41
@yperess yperess changed the title math: Add a new DSP subsystem with configurable backend math: enable using cmsis-disp basicmath for all platforms Oct 7, 2022
@yperess yperess requested a review from stephanosio October 7, 2022 19:41
@stephanosio
Copy link
Member

Updated to use CMSIS-DSP directly via a shimming "core". Note that this will fail to build on some platforms until zephyrproject-rtos/cmsis#20 is merged.

Please pull the CMSIS-DSP PR here then.

@zephyrbot zephyrbot added manifest west DNM This PR should not be merged (Do Not Merge) labels Oct 10, 2022
@zephyrbot
Copy link

zephyrbot commented Oct 10, 2022

The following west manifest projects have been modified in this Pull Request:

Name Old Revision New Revision Diff
cmsis zephyrproject-rtos/cmsis@093de61 zephyrproject-rtos/cmsis@74981bf (master) zephyrproject-rtos/[email protected]

Note: This message is automatically posted and updated by the Manifest GitHub Action.

@zephyrbot zephyrbot requested a review from gmarull November 25, 2022 06:57
@yperess
Copy link
Contributor Author

yperess commented Nov 25, 2022

I'm ok with handling the extensions for ARC separately (see #52478), but we'll have to resolve the discussions within this release to avoid introducing an API that gets modified in a next release again.

I asked about the __restrict and input buffer re-use with our DSP experts. Despite the semantics of __restrict, it doesn't mean that things are guaranteed to be broken when input buffers are reused for outputs. For simple 'a = a + b' cases there should not be a problem even with __restrict. Typically whether this is allowed or not is determined more by the algorithm nature and C code then by the compiler handling __restrict optimizations.

I don't know, I'll need to test a few things. Here's a great example of the optimization that can be done with restrict https://en.m.wikipedia.org/wiki/Restrict#Optimization. To me it seems like adding that attribute is risky, but I'll let others weigh in.

@yperess
Copy link
Contributor Author

yperess commented Nov 28, 2022

@ruuddw @SiyuanCheng-CN @teburd @stephanosio

Upon further review and consulting with some of our compiler guys I've reached the following conclusions:

RE: __restrict

We should not allow __restrict as a pointer attribute in this API. By definition it means that the behavior of a += b is undefined and can break with different compilers or even versions of the same compiler.

RE: __agu

This seems perfectly reasonable, though I would like it to go in a separate CL so we can have a focused discussion on how are we going to add architecture specific attributes to this API (which I assume we'll be running into quite a lot and this will be a long discussion that I don't want my other work to be blocked on).

@ruuddw
Copy link
Member

ruuddw commented Nov 28, 2022

I agree with the __restrict 'risk' and the AGU is obviously not relevant for all architectures/implementations. Therefore the proposal is to add these via macro's that user can configure/define: if not applicable for a certain implementation backend, the preprocessor removes them.
The main discussion I think that needs to be resolved (independent of interface implementations) is the semantics that we want. Should a = math_op(a,b) and b = math_op(a,b) both be supported by any implementation backend? Or only reuse of the first (or second)? Should it be supported for any math_op, or only for simpler ones?
My preference would be to have the interface semantics defined more strict (i.e. adhering to our optimized __restrict), but if the majority prefers the semantics to be defined including in-place/argument aliasing support we will find other ways to have optimal DSP performance and compatibility with the ARC dsp library for our users.

@yperess
Copy link
Contributor Author

yperess commented Nov 28, 2022

I agree with the __restrict 'risk' and the AGU is obviously not relevant for all architectures/implementations. Therefore the proposal is to add these via macro's that user can configure/define: if not applicable for a certain implementation backend, the preprocessor removes them. The main discussion I think that needs to be resolved (independent of interface implementations) is the semantics that we want. Should a = math_op(a,b) and b = math_op(a,b) both be supported by any implementation backend? Or only reuse of the first (or second)? Should it be supported for any math_op, or only for simpler ones? My preference would be to have the interface semantics defined more strict (i.e. adhering to our optimized __restrict), but if the majority prefers the semantics to be defined including in-place/argument aliasing support we will find other ways to have optimal DSP performance and compatibility with the ARC dsp library for our users.

That seems fair, I guess the tradeoff we're looking at is:

Option 1

Disallow in-place DSP operations. Which will allow for __restrict and generally require more memory in places where in-place arithmetic would have been used.

Option 2

Disallow __restrict optimization thus making in-place operations possible (potentially saving some memory) at the cost of some performance.

@ruuddw
Copy link
Member

ruuddw commented Nov 29, 2022

Nice summary of the options, that's indeed what I think needs to be decided. For a generic API, my vote is to have the stricter semantics. But we probably should get some more inputs on the user side: is there a lot of application code that assumes in-place processing is supported? I'm also not sure if CMSIS-DSP defines this explicitly.

@yperess
Copy link
Contributor Author

yperess commented Nov 29, 2022

Nice summary of the options, that's indeed what I think needs to be decided. For a generic API, my vote is to have the stricter semantics. But we probably should get some more inputs on the user side: is there a lot of application code that assumes in-place processing is supported? I'm also not sure if CMSIS-DSP defines this explicitly.

I posted this last night, I think there's a third option for a Kconfig, but it makes things a bit more complex.
#52617

@yperess yperess requested a review from stephanosio November 30, 2022 16:34
teburd
teburd previously approved these changes Dec 1, 2022
Copy link
Contributor

@teburd teburd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, should add the experimental status to the API overview.rst and a minimal subsys doc that gives a brief overview of the intent here.

https://github.com/zephyrproject-rtos/zephyr/blob/main/doc/develop/api/overview.rst

In my mind this is fine for experimental status, with the expectation that some things may change, potentially in breaking ways still.

Copy link
Member

@stephanosio stephanosio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks almost ready, just a minor change for the test names.

Comment on lines +47 to +48
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note to myself (or anyone who might be interested in maintaining this in the future) that CONFIG_FP16 is currently an ARM-specific config and __fp16 is an ARM-specific type.

The CONFIG_FP16 Kconfig should eventually be moved to the same level as CONFIG_FPU and the architecture-agnostic ISO _Float16 type support should be implemented.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without knowing the details myself: is there a reason for not using the ISO _Float16 already now then? Is it semantically different from __fp16?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On ARM, the __fp16 type can represent a half-precision floating point value in either IEEE 754-2008 format or the "ARM alternative format" depending on the specified compiler options -- we do not want to take away that capability for now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the tests to tests/subsys/dsp/ as they'll be a part of the Zephyr
dsp subsystem.

Signed-off-by: Yuval Peress <[email protected]>
@teburd
Copy link
Contributor

teburd commented Dec 1, 2022

A few compliance issues look legit

Yuval Peress added 2 commits December 1, 2022 12:35
Introduce an API mirroring the CMSIS-DSP's basicmath. If CMSIS_DSP is
enabled, then it will by default be used as a backend. Developers may
opt into a custom backend by setting CONFIG_DSP_BACKEND_CMSIS=n. If
done, the application must provide `zdsp_backend/dsp.h` and optionally
implement the functions in its own .c files.

Signed-off-by: Yuval Peress <[email protected]>
Update the relevant tests in basicmath to use the dsp subsystem. Note
that f16 is not updated since it's much more architecture specific and
did not become a part of zdsp.

Signed-off-by: Yuval Peress <[email protected]>
menuconfig CMSIS_DSP
bool "CMSIS-DSP Library Support"
depends on (CPU_CORTEX && NEWLIB_LIBC) || ARCH_POSIX
depends on NEWLIB_LIBC || ARCH_POSIX
Copy link
Contributor

@teburd teburd Dec 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker, but a note, depends on NEWLIBC should be revisited when PICOLIBC becomes the thing

@carlescufi carlescufi merged commit 041e853 into zephyrproject-rtos:main Dec 2, 2022
@yperess yperess deleted the peress/cmsis branch December 2, 2022 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants